EMU: an Enhanced Hierarchical Speech Data Management System
Abstract
EMU is a system for labelling, managing and retrieving data from speech databases such as the Australian ANDOSL database or the US TIMIT. EMU is a re-implementation of the earlier MU+ system (Harrington, Cassidy, Fletcher, and McVeigh 1993) with the aim of providing a more flexible environment. The hierarchical structures and database query facility have been generalised and the system has been extended to include an interactive labeller with spectrogram and waveform displays. EMU incorporates the Tcl/Tk scripting language which can be used to extend the labeller and to perform many automated operations on databases; as an example, scripts have been written to automatically construct hierarchical descriptions given Phonetic level labels. The need for increased flexibility was driven largely by the desire to use the system on languages other than English. This paper concludes by describing a database for Cantonese, and a database used in a kinematic study of vowel lengthening, both of which include facilities for automatically generating hierarchies.
Similar resources
Compiling multi-tiered speech databases into the relational model: experiments with the emu system
The Emu speech database system enables the annotation of speech signals at many levels of detail and provides a mechanism for making links between these levels to produce a hierarchical annotation. Emu provides facilities for searching collections of these annotations according to both sequential and hierarchical criteria. The results of a search can be used to retrieve acoustic and other data ...
Multi-level Annotation of Speech: An Overview of The Emu Speech Database Management System
Researchers in various fields, from acoustic phonetics to child language development, rely on digitised collections of spoken language data as raw material for research. Access to this data has, in the past, been provided in an ad-hoc manner with labelling standards and software tools developed to serve only one or two projects. A few attempts have been made at providing generalised access to s...
Multi-level annotation in the Emu speech database management system
Researchers in various fields, from acoustic phonetics to child language development, rely on digitised collections of spoken language data as raw material for research. Access to this data had, in the past, been provided in an ad-hoc manner with labelling standards and software tools developed to serve only one or two projects. A few attempts have been made at providing generalised access to sp...
Extending the EMU Speech Database Management System: Cloud Hosting, Team Collaboration, Automatic Revision Control
In this paper, we introduce a new component of the EMU Speech Database Management System [1, 2] to improve the team workflow of handling production data (both acoustic and physiological) in phonetics and the speech sciences. It is named emuDB Manager, and it facilitates the coordination of team efforts, possibly distributed over several nations, by introducing automatic revision control (based ...
Managing speech databases with emuR and the EMU-webApp
As is the nature of the discipline, a majority of speech and language researchers spend a large amount of their time acquiring and transforming data into analyzable and interpretable forms to gain a better understanding of a certain subject matter. In this paper we present a collection of tools that aid the researcher in this sometimes tedious and error-prone process. The tools presented here a...